Abstract - We present a simultaneous generalization of the
well-known Karhunen-Loeve (PCA) and $k$-means algorithms.
The basic idea lies in approximating the data with $k$ affine
subspaces of a given dimension $n$. In the case $n = 0$ we obtain
the classical $k$-means, while for $k = 1$ we obtain the PCA algorithm.

We show that for some data exploration problems this method
gives better results than either of the classical approaches.

Index Terms — Karhunen-Loeve Transform, PCA, k-Means, 
optimization, compression, data compression, image compression. 



I. Introduction 

Our general problem concerns splitting a given dataset $W$ into
clusters with respect to their intrinsic dimensionality. The
motivation to create such an algorithm is a desire to extract
parts of the data which can be easily described by a smaller
number of parameters. More precisely, we want to find affine
spaces $S_1, \ldots, S_k$ such that every element of $W$ belongs
(with a certain maximal error) to one of the spaces $S_1, \ldots, S_k$.

To explain it graphically, let us consider the following example.
Figure 1(a) represents three lines in the plane, while Figure 1(b)
represents a circle and an orthogonal line in space. Our goal is
to construct an algorithm that will split them into three lines
and into a line and a circle, respectively.

Fig. 1. Our goal is to create an algorithm which will interpret Fig. 1(a) as three
groups of one-dimensional points and Fig. 1(b) as two groups of one- and
two-dimensional points.

We have constructed a simultaneous generalization of the $k$-means
method [1] and the Karhunen-Loeve transform (also called PCA,
Principal Component Analysis) [2]; we call it $(\omega, k)$-means.
Instead of finding $k$ centers which best represent the data as in
the classical $k$-means, we find $k$ nearest subspaces of a given
fixed dimension $n$, where $\omega$ denotes the weight which takes
part in measuring the distance. In analogy to the case of $k$-means,
we obtain a version of the Voronoi diagram (for details see the next
section). In the simplest form our algorithm needs the number of
clusters $k$ and the dimension $n$ (for $n = 0$ we obtain the
$k$-means, while for $k = 1$ we obtain the PCA).

To present our method, consider the points grouped along
two parallel lines; Figure 2 presents the result of our program
on the clustering of this set.

Fig. 2. Example of clustering: Fig. 2(a) for $k = 2$ clusters which are
1-dimensional; Fig. 2(b) for classical $k$-means with $k = 2$ clusters.

The approach can clearly be used in most standard applications
of either the $k$-means or the Karhunen-Loeve transform.
In particular, since it is a generalization of the Karhunen-Loeve
transform [3], one of the natural applications of our method lies
in image compression. Figure 3 presents the error¹ in image
reconstruction of the classical Lena photo (508 x 508 pixels) as
a function of $k$. Observe that just by changing the number of
clusters from 1 to 3, which causes only a minimal increase in the
necessary memory, we reduce the level of error in the compression
by half.

Apart from image compression, our method can be applied in
various situations where the classical $k$-means or PCA were
used, for example in:

• data mining - we can detect important coordinates and
subsets with similar properties;

• clustering - our modification of $k$-means can detect
different, high-dimensional relations in data;

• image compression and image segmentation;

• pattern recognition - thanks to the detection of relations in
data we can use it to assign data to previously defined classes.

The basic idea of the $(\omega, k)$-means algorithm can be described
as follows:

    choose initial clusters distribution
    repeat
        apply the Karhunen-Loeve method for each cluster
        appoint new clusters
    until decrease of "energy" is below given error

¹ By error in image comparison we understand a pixel-by-pixel comparison of
images using the standard Euclidean norm.

Fig. 3. Error in image decompression as a function of the number of clusters $k$
for $n = 5$.

Fig. 4. Graphical presentation of $B(p, q)$, $D(p, q)$ and $D(p, S)$ in $\mathbb{R}^2$.

II. Generalized Voronoi Diagram 

The Voronoi diagram is one of the most useful data structures
in computational geometry, with applications in many
areas of science [4]. For the convenience of the reader and to
establish the notation we briefly describe the classical version
of the Voronoi diagram (for more details see [5]). For $N \in \mathbb{N}$
consider $\mathbb{R}^N$ with the standard Euclidean distance and let $S$
be a finite subset of $\mathbb{R}^N$. For $p, q \in S$ such that $p \neq q$, let

$$B(p, q) = \{z \in \mathbb{R}^N : \|p - z\| = \|q - z\|\}, \tag{1}$$

$$D(p, q) = \{z \in \mathbb{R}^N : \|p - z\| < \|q - z\|\}. \tag{2}$$

The hyperplane $B(p, q)$ divides $\mathbb{R}^N$ into two sets, one containing
the points which are closer to $p$ than to $q$ ($D(p, q)$), and the
other containing the points which are closer to $q$ than to $p$
($D(q, p)$); see Figure 4(a).



Definition 2.1 ([5]): The set

$$D(p, S) := \bigcap_{q \in S:\, q \neq p} D(p, q)$$

of all points that are closer to $p$ than to any other element of
$S$ is called the (open) Voronoi region of $p$ with respect to $S$.

For $N = 2$ the set $D(p, S)$ is the interior of a convex, possibly
unbounded polygon (Figure 4(b)). The points on the contour of
$D(p, S)$ are those that have more than one nearest neighbor in $S$,
one of which is $p$.

Definition 2.2 ([5]): The union

$$V(S) := \bigcup_{p \in S} \partial D(p, S)$$

of all region boundaries is called the Voronoi diagram of $S$.

The common boundary of two Voronoi regions is a Voronoi
edge. Two edges meet at a Voronoi vertex; such a point has
three or more nearest neighbors in the set $S$.

Now we proceed to the description of our modification of
the Voronoi diagram. We divide the space $\mathbb{R}^N$ with respect to
affine subspaces of $\mathbb{R}^N$.

Definition 2.3: For $n < N$ let

$$E_n(\mathbb{R}^N) := \{(v_0, \ldots, v_n) \in (\mathbb{R}^N)^{n+1} \text{ such that } v_i, v_j \text{ are orthonormal for } i, j > 0,\ i \neq j\}.$$

Thus $v_0$ denotes a center of the affine space we consider,
while $v_1, \ldots, v_n$ is the orthonormal base of its "vector
part". From the geometrical point of view the element
$v = (v_0, v_1, \ldots, v_n) \in E_n(\mathbb{R}^N)$ represents the affine space

$$v_0 + \mathrm{lin}(v_1, \ldots, v_n) = \mathrm{aff}(v_0, v_1, \ldots, v_n).$$

We modify equations (1) and (2) by using the distance between
a point and the affine subspace generated by linearly independent
vectors instead of the distance between points.

Definition 2.4: Let $n < N$ and let $v \in E_n(\mathbb{R}^N)$ and
$\omega = (\omega_0, \ldots, \omega_n) \in [0, 1]^{n+1}$ such that
$\sum_{j=0}^{n} \omega_j = 1$ be given. For $x \in \mathbb{R}^N$ let

$$\mathrm{DIST}_\omega(x; v) := \left( \sum_{j=0}^{n} \omega_j \, \mathrm{dist}(x; \mathrm{aff}(v_0, \ldots, v_j))^2 \right)^{1/2}, \tag{3}$$

where $\mathrm{dist}(x; V)$ denotes the distance of the point $x$ from the
space $V$.

In formula (3), $\omega = (\omega_0, \ldots, \omega_n)$ is interpreted as a vector of
weights, where $\omega_k$ denotes the weight of the affine subspace
of dimension $k$. It is easy to notice that $\mathrm{DIST}_\omega$ has the following
properties:

• for $v \in E_n(\mathbb{R}^N)$ and $\omega = (0, \ldots, 0, 1) \in [0, 1]^{n+1}$ we
obtain that $\mathrm{DIST}_\omega(x; v)$ is the distance between the point
$x$ and the affine space $\mathrm{aff}(v)$;

• if $v_0 = 0$ and $\omega = (0, \ldots, 0, 1)$ then $\mathrm{DIST}_\omega$ is the distance
between the point and the linear space generated by $(v_1, \ldots, v_n)$;

• if $\omega = (1, 0, \ldots, 0)$ then $\mathrm{DIST}_\omega$ is the classical distance
between $x$ and $v_0$:

$$\mathrm{DIST}_\omega(x; v) = \|x - v_0\|.$$

Remark 2.5: Formula (3) can be computed as follows:

$$\left(\mathrm{DIST}_\omega(x; v)\right)^2 = \sum_{j=0}^{n} \omega_j \left( \|x - v_0\|^2 - \sum_{i=1}^{j} \langle x - v_0; v_i \rangle^2 \right) = \sum_{j=0}^{n} \omega_j \|x - v_0\|^2 - \sum_{j=0}^{n} \omega_j \sum_{i=1}^{j} \langle x - v_0; v_i \rangle^2 .$$
Fig. 5. Generalized Voronoi diagram for $\omega = (0, 1)$ and (a) two, (b) three
and (c) four lines on the plane.

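To optimize calculations we define

$$\beta_1 := \langle x - v_0; v_1 \rangle^2, \qquad \beta_j := \beta_{j-1} + \langle x - v_0; v_j \rangle^2 \quad \text{for } j = 2, \ldots, n,$$

and since $\sum_{j=0}^{n} \omega_j = 1$, we simplify our computation to

$$\left(\mathrm{DIST}_\omega(x; v)\right)^2 = \|x - v_0\|^2 - \sum_{j=1}^{n} \omega_j \beta_j .$$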
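The running-sum computation above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `dist_omega_sq` and the list-based vector representation are assumptions, and the basis vectors are assumed to be orthonormal as in Definition 2.3.

```python
def dist_omega_sq(x, v0, basis, omega):
    """Squared DIST_omega(x; v) for v = (v0, basis[0], ..., basis[n-1]).

    Assumes `basis` is orthonormal and len(omega) == len(basis) + 1.
    """
    if len(omega) != len(basis) + 1:
        raise ValueError("omega needs one weight per dimension 0..n")
    diff = [xi - ci for xi, ci in zip(x, v0)]
    sq_norm = sum(d * d for d in diff)      # ||x - v0||^2
    total = omega[0] * sq_norm              # j = 0: distance to the point v0
    beta = 0.0                              # running sum of squared projections
    for w, b in zip(omega[1:], basis):
        proj = sum(d * bi for d, bi in zip(diff, b))
        beta += proj * proj                 # beta_j = beta_{j-1} + <x - v0; v_j>^2
        total += w * (sq_norm - beta)       # omega_j * dist(x; aff(v0..vj))^2
    return total
```

For example, with `x = (3, 4)`, `v0 = (0, 0)` and the single basis vector `(1, 0)`, the weight `(1, 0)` gives the squared distance to the point `v0`, while `(0, 1)` gives the squared distance to the horizontal axis.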

Now we are ready to define our generalization of the
Voronoi diagram. Let $S$ be a finite subset of $E_n(\mathbb{R}^N)$ and
$\omega \in [0, 1]^{n+1}$, $\sum_{j=0}^{n} \omega_j = 1$, where $n < N$. For $p, q \in S$
such that $p \neq q$, let

$$B_\omega(p, q) := \{z \in \mathbb{R}^N : \mathrm{DIST}_\omega(z; p) = \mathrm{DIST}_\omega(z; q)\},$$

$$D_\omega(p, q) := \{z \in \mathbb{R}^N : \mathrm{DIST}_\omega(z; p) < \mathrm{DIST}_\omega(z; q)\}.$$

The set $B_\omega(p, q)$ divides the space $\mathbb{R}^N$ into two sets, the first
containing the points which are closer to $p$ than to $q$ ($D_\omega(p, q)$)
and the second containing the points which are closer to $q$ than
to $p$ ($D_\omega(q, p)$).

Definition 2.6: Let $n \in \mathbb{N}$, $n < N$ be fixed. Let $S$ be a finite
subset of $E_n(\mathbb{R}^N)$ and $\omega \in [0, 1]^{n+1}$, $\sum_{j=0}^{n} \omega_j = 1$, be given.
For $p \in S$ the set

$$D_\omega(p, S) := \bigcap_{q \in S:\, q \neq p} D_\omega(p, q)$$

of all points that are closer to $p$ than to any other element of
$S$ is called the (open) generalized Voronoi region of $p$ with
respect to $S$.

Applying this definition we obtain a new type of Voronoi
diagram. As we can see in Figure 5, if $\omega = (0, 1)$ we divide the
plane into (not necessarily convex) polygons, a situation similar
to the classical Voronoi diagram. Figure 6 presents a generalized
diagram on the plane for different weights changing
from $\omega = (1, 0)$ to $\omega = (0, 1)$. In general the
boundary sets are usually not polygons but zero sets of quadratic
polynomials. The same happens in $\mathbb{R}^3$ even for $\omega = (0, 1)$; see
Figure 7, where we show the points with equal distance from
two lines.
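As a small illustration of Definition 2.6, the following sketch decides which generalized Voronoi region a point of the plane belongs to when the sites are lines and $\omega = (0, 1)$. The helper names `line_dist_sq` and `region_index`, the tolerance `tol`, and the representation of a line as a pair (center, unit direction) are assumptions made for this example only.

```python
def line_dist_sq(z, v0, v1):
    """Squared distance from z to the line v0 + t*v1 (v1 must have unit length)."""
    dx, dy = z[0] - v0[0], z[1] - v0[1]
    proj = dx * v1[0] + dy * v1[1]
    return dx * dx + dy * dy - proj * proj


def region_index(z, lines, tol=1e-12):
    """Index of the line whose open region contains z, or None on a boundary."""
    dists = [line_dist_sq(z, v0, v1) for v0, v1 in lines]
    best = min(range(len(lines)), key=dists.__getitem__)
    # A point equidistant (up to tol) from two sites lies on B_omega, not in a region.
    ties = [i for i, d in enumerate(dists) if abs(d - dists[best]) <= tol]
    return best if len(ties) == 1 else None
```

With the two coordinate axes as sites, the point `(3, 1)` falls in the region of the horizontal axis, while `(2, 2)` lies on the boundary set and the function returns `None`.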

III. Generalization of the k-means method 

Clustering is the classical problem of dividing a set
$S \subset \mathbb{R}^N$ into separate clusters, or in other words, into sets
showing a given type of behavior.

Fig. 7. Generalized Voronoi diagram for $\omega = (0, 1)$ and two lines.
A. k-means 

One of the most popular and basic methods of clustering
is the $k$-means algorithm. With this approach we want to
divide $S$ into $k$ clusters $S_1, \ldots, S_k$ with minimal energy. For
the convenience of the reader and to establish the notation we
briefly present the $k$-means algorithm.

For a cluster $S$ and $r \in \mathbb{R}^N$ we define

$$E(S, r) := \sum_{s \in S} \|s - r\|^2 .$$

The function $E(S, r)$ is often interpreted as an energy. We
say that the point $\bar{r}$ best "describes" the set $S$ if the energy is
minimal, more precisely, if

$$E(S, \bar{r}) = \inf_{r \in \mathbb{R}^N} \{E(S, r)\}.$$

It is easy to show that the barycenter (mean) of $S$ minimizes the
function $E(S, \cdot)$ (for more information see [6], [7]). The above
consideration can be precisely formulated as follows:

Theorem 3.1 (k-means): Let $S$ be a finite subset of $\mathbb{R}^N$.
We have

$$E(S, \mu(S)) = \inf_{r \in \mathbb{R}^N} \{E(S, r)\},$$

where $\mu(S) := \frac{1}{\mathrm{card}\, S} \sum_{s \in S} s$ denotes the barycenter of $S$.
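The statement of Theorem 3.1 can be checked with a short computation. The derivation below is a standard argument added for illustration, not quoted from the cited sources; it uses only the definitions of $E(S, r)$ and of the barycenter, and the fact that $\sum_{s \in S} (s - \mu(S)) = 0$.

```latex
\begin{align*}
E(S, r) &= \sum_{s \in S} \|(s - \mu(S)) + (\mu(S) - r)\|^2 \\
        &= \sum_{s \in S} \|s - \mu(S)\|^2
           + 2 \Big\langle \sum_{s \in S} (s - \mu(S)),\; \mu(S) - r \Big\rangle
           + \operatorname{card}(S)\, \|\mu(S) - r\|^2 \\
        &= E(S, \mu(S)) + \operatorname{card}(S)\, \|\mu(S) - r\|^2 .
\end{align*}
```

The last term is non-negative and vanishes only for $r = \mu(S)$, so the barycenter is the unique minimizer.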
Thus in the $k$-means the goal is to find such a clustering
$S = S_1 \cup \ldots \cup S_k$ that the function

$$E(S_1, \ldots, S_k) := \sum_{j=1}^{k} E(S_j, \mu(S_j))$$

is minimal. The $k$-means algorithm for the set $S$ after S. Lloyd
[8]-[10] proceeds as follows:



stop condition
    choose $\varepsilon > 0$
initial conditions
    choose randomly points $\{\bar{s}_1, \ldots, \bar{s}_k\} \subset S$
    obtain first clustering $(S_1, \ldots, S_k)$ by matching each of
    the points $s \in S$ to the cluster $S_j$ specified by $\bar{s}_j$ such
    that $\|s - \bar{s}_j\|^2$ is minimal
repeat
    let $E = E(S_1, \ldots, S_k)$
    compute new points $\bar{s}_1, \ldots, \bar{s}_k$ which best "describe"
    the clusters ($\bar{s}_j = \mu(S_j)$ for $j = 1, \ldots, k$)
    obtain new clustering $(S_1, \ldots, S_k)$ by adding each of
    the points $s \in S$ to the cluster $S_j$ such that $\|s - \bar{s}_j\|^2$ is
    minimal
until $E - E(S_1, \ldots, S_k) < \varepsilon$

Lloyd's method guarantees a decrease of the energy in each iteration but
does not guarantee that the result will be optimal.
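The iteration above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the initial centers are passed in explicitly instead of being drawn at random, an empty cluster simply keeps its previous center, and `max_iter` is an added safety bound.

```python
def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def barycenters(points, labels, old_centers):
    """Barycenter of every cluster; an empty cluster keeps its old center."""
    centers = list(old_centers)
    for j in range(len(centers)):
        members = [p for p, l in zip(points, labels) if l == j]
        if members:
            centers[j] = mean(members)
    return centers


def clustering_energy(points, labels, centers):
    """E(S_1, ..., S_k): energy of each cluster with respect to its barycenter."""
    centers = barycenters(points, labels, centers)
    return sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))


def lloyd(points, initial_centers, eps=1e-9, max_iter=100):
    centers = list(initial_centers)
    labels = assign(points, centers)
    for _ in range(max_iter):
        e_old = clustering_energy(points, labels, centers)
        centers = barycenters(points, labels, centers)
        labels = assign(points, centers)
        if e_old - clustering_energy(points, labels, centers) < eps:
            break
    return barycenters(points, labels, centers), labels
```

Running `lloyd` several times from different initial centers and keeping the clustering with the lowest energy is one simple way to soften the dependence on initialization.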

B. (ω, k)-means 

In this subsection we consider a generalization of $k$-means
similar to that from the previous section concerning the
Voronoi diagram. Instead of looking for the points which best
"describe" clusters we seek $n$-dimensional subspaces of $\mathbb{R}^N$.

Let $S \subset \mathbb{R}^N$ and $\omega \in [0, 1]^{n+1}$, $\sum_{j=0}^{n} \omega_j = 1$, be fixed. For
$v \in E_n(\mathbb{R}^N)$ let

$$E_\omega(S, v) := \sum_{s \in S} \mathrm{DIST}_\omega(s; v)^2 .$$

We interpret the function $E_\omega(S, v)$ as the energy of the set $S$
with respect to the subspace generated by $v$. If the energy is
zero, the set $S$ is a subset of the affine space generated by $v$. We
say that $\bar{v}$ best "describes" the set $S$ if the energy is minimal,
more precisely if

$$E_\omega(S, \bar{v}) = \inf_{v \in E_n(\mathbb{R}^N)} \{E_\omega(S, v)\}.$$

To obtain an optimal base we use the classical Karhunen-Loeve
transform (also called Principal Component Analysis, PCA for
short), see [2]. The basic idea behind PCA is to find the
coordinate system in which the first few coordinates give us
the "largest" possible information about our data.

Theorem 3.2 (PCA): Let $S = \{s_1, \ldots, s_m\}$ be a finite
subset of $\mathbb{R}^N$. Let

$$M(S) := (v_0, \ldots, v_N) \in E_N(\mathbb{R}^N)$$

be such that

• $v_0 = \mu(S)$;

• $v_1, \ldots, v_N$ are pairwise orthogonal eigenvectors of
$[s_1 - v_0, \ldots, s_m - v_0] \cdot [s_1 - v_0, \ldots, s_m - v_0]^T$ arranged in
descending order (according to the eigenvalues)².

For every $n < N$ and $\omega \in [0, 1]^{n+1}$ we have

$$E_\omega(S, M_n(S)) = \inf_{v \in E_n(\mathbb{R}^N)} \{E_\omega(S, v)\},$$

where $M_n(S) := (v_0, \ldots, v_n)$.

Thus, given $\omega \in [0, 1]^{n+1}$, $\sum_{j=0}^{n} \omega_j = 1$, in $(\omega, k)$-means our
goal is to find such a clustering $S = S_1 \cup \ldots \cup S_k$ that the
function

$$E_\omega(S_1, \ldots, S_k) := \sum_{j=1}^{k} E_\omega(S_j, M_n(S_j)) \tag{4}$$

is minimal. Consequently, the $(\omega, k)$-means algorithm can be
described as follows:

stop condition
    choose $\varepsilon > 0$
initial conditions
    choose randomly points $\{\bar{s}_1, \ldots, \bar{s}_k\} \subset S$
    obtain first clustering $(S_1, \ldots, S_k)$ by matching each of
    the points $s \in S$ to the cluster $S_j$ such that $\|s - \bar{s}_j\|^2$ is
    minimal
repeat
    let $E = E_\omega(S_1, \ldots, S_k)$
    compute vectors $v_1, \ldots, v_k$, which best "describe" the
    clusters, by the PCA method ($v_j = M_n(S_j)$)
    obtain new clustering $(S_1, \ldots, S_k)$ by adding each of
    the points $s \in S$ to the cluster $S_j$ such that $\mathrm{DIST}_\omega(s; v_j)$
    is minimal
until $E - E_\omega(S_1, \ldots, S_k) < \varepsilon$

As is the case in the classical $k$-means, our algorithm
guarantees a decrease of the energy in each iteration but does not
guarantee that the result will be optimal (cf. Example 3.3).
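To make the loop concrete, here is a deliberately restricted sketch for the special case of points in $\mathbb{R}^2$, $n = 1$ and $\omega = (0, 1)$, so that every cluster is described by a line. Several things are assumptions of this sketch rather than part of the method as stated: the function names, the use of the closed-form principal-axis angle of a 2 x 2 scatter matrix in place of a general eigen-decomposition, the fact that the iteration starts from a given labelling rather than from random points, and the `max_iter` safety bound.

```python
import math


def fit_line(points):
    """PCA in the plane: barycenter and unit direction of largest variance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # principal-axis angle
    return (mx, my), (math.cos(theta), math.sin(theta))


def line_dist_sq(p, line):
    """Squared distance from p to the line (center, unit direction)."""
    (cx, cy), (ux, uy) = line
    dx, dy = p[0] - cx, p[1] - cy
    proj = dx * ux + dy * uy
    return dx * dx + dy * dy - proj * proj


def fit_all(points, labels, lines):
    """Refit every non-empty cluster; an empty cluster keeps its old line."""
    lines = list(lines)
    for j in range(len(lines)):
        members = [p for p, l in zip(points, labels) if l == j]
        if members:
            lines[j] = fit_line(members)
    return lines


def energy(points, labels, lines):
    return sum(line_dist_sq(p, lines[l]) for p, l in zip(points, labels))


def omega_k_means_lines(points, labels, k, eps=1e-9, max_iter=100):
    """Alternate nearest-line reassignment and PCA refits, starting from `labels`."""
    lines = fit_all(points, labels, [None] * k)
    if any(line is None for line in lines):
        raise ValueError("every initial cluster must be non-empty")
    e_old = energy(points, labels, lines)
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda j: line_dist_sq(p, lines[j]))
                  for p in points]
        lines = fit_all(points, labels, lines)
        e_new = energy(points, labels, lines)
        if e_old - e_new < eps:
            break
        e_old = e_new
    return labels, lines
```

On points placed along two parallel horizontal lines, a labelling that is mostly correct is repaired after one reassignment. Starting instead from a left/right split fits the same middle line to both halves and the iteration stalls, which is the kind of dependence on the initial clustering discussed in Example 3.3.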

Example 3.3: As already mentioned in Section III-A, the
$k$-means does not necessarily find a global minimum and strongly
depends on the initial selection of clusters. In our case, this effect
can be even more visible. Consider the case of a circle $C$ in $\mathbb{R}^2$
with 4 clusters and $\omega = (0, 1)$. Figure 8(a) shows the
clustering obtained by using the $(\omega, k)$-means algorithm. Of course
it is a local minimum of $E_\omega$; however, as we see in Figure 8(b),
it is far from being the global minimum.

The initial cluster selection in our algorithm is the same as in
the $k$-means algorithm, but it is possible to consider other ways:

• the $k$-means++ algorithm [11];

• starting from a given division (not from a random distribution);

• repeating the initial choice of clusters many times.

Each of the above approaches usually solves the problem
described in Example 3.3.

Remark 3.4: Let $S \subset \mathbb{R}^N$ and $v \in E_n(\mathbb{R}^N)$. It is easy to
notice that the above method has the following properties:

• for $\omega = (1, 0, \ldots, 0)$ we obtain the classical $k$-means,

• for $k = 1$ we get the Karhunen-Loeve transform.

As the algorithm's outcome we get:

• a division of the data into clusters $\{S_1, \ldots, S_k\}$;

• for each cluster an affine space of dimension $n$ obtained
by the Karhunen-Loeve method which best represents the
given cluster.

Example 3.5: If we apply our algorithm to a regular plane
subset (e.g. a square) we obtain a generalized Voronoi diagram (cf.
Fig. 6). Figure 9 presents the clustering for different weight vectors
changing from $\omega = (1, 0)$ to $\omega = (0, 1)$.

² $[s_1 - v_0, \ldots, s_m - v_0]$ is a matrix with columns $s_j - v_0$, for $j = 1, \ldots, m$.

Fig. 6. Generalized Voronoi diagram on the plane for weights changing from
$\omega = (1, 0)$ to $\omega = (0, 1)$.

Fig. 8. Circle clustering in $\mathbb{R}^2$ for 4 clusters with $\omega = (0, 1)$: (a) local
minimum, (b) global minimum. The $(\omega, k)$-means method strongly depends on
initial conditions.

Fig. 9. $(\omega, k)$-means method for clustering a set into 4 clusters for different
weight vectors.

Fig. 10. Clustering with: Fig. 10(a) $k$-means; Fig. 10(b) $(\omega, k)$-means.

IV. Applications 

A. Clustering 

Clustering by the $(\omega, k)$-means algorithm gives a better
description of the internal geometry of a set; in particular, it found
a reasonable splitting into connected components of
the points grouped along two parallel sections (see Figure 2).
A similar effect can be seen in the next example, where we consider
the points grouped along a circle and an interval, see Figure 10.

In conclusion, in many cases the $(\omega, k)$-means method can be
very useful in seeking $n$-dimensional (connected) components
of given data sets.



B. Analysis of Functions 

In this subsection we consider real data from acoustics.
Acoustical engineers [12] study reverberation, which is observed
when a sound is produced in an enclosed space, causing
a large number of echoes to build up and then slowly decay as
the sound is absorbed by the walls and the air. Reverberation
time is crucial for describing the acoustic quality of a room or
space. It is the most important parameter for describing sound
levels, speech intelligibility and the perception of music, and
is used to correct or normalize building acoustics and sound
power measurements.

We analyze the decay curve (see Figure 11(a)), which
presents the measurement of sound level in time and describes
the way in which a sound impulse vanishes into background noise.
Based on this we want to recover the reverberation time. In
particular we know that we have two linear components: the first
connected with sound absorption by the space and the second
with background noise. To use statistical analysis, we have to
extract both of them. Our algorithm detects the $n$-dimensional
subspaces, so for the parameter $\omega = (0, 1)$ we obtain a linear
approximation of the function, see Figure 11(b).

Results obtained using our algorithm are comparable with
those obtained by classical methods and give more opportunities
for further research.

C. Image compression 

Our algorithm can be used to compress images. First, we
interpret the photo as a matrix. We do this by dividing it into
blocks of 8 by 8 pixels, where each pixel is described (in RGB) by
3 parameters. Each of the blocks is represented as a vector from
$\mathbb{R}^{192}$. By this operation we obtain a dataset in $\mathbb{R}^{192}$.
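The block decomposition described above can be sketched as follows. The function name `image_to_block_vectors` and the nested-list image representation are assumptions for illustration, and the sketch requires the image height and width to be multiples of the block size; how the original implementation treats other sizes is not stated in the text.

```python
def image_to_block_vectors(image, block=8):
    """Flatten each block x block tile of an RGB image into one vector.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Assumes height and width are multiples of `block`.
    """
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image size must be a multiple of the block size")
    vectors = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            vec = []
            for y in range(top, top + block):
                for x in range(left, left + block):
                    vec.extend(image[y][x])  # append the r, g, b values
            vectors.append(vec)
    return vectors
```

With the default `block=8` each vector has 8 * 8 * 3 = 192 entries, which matches the dimension of the dataset used in this section.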

Taking into consideration the classical Lena picture (508 x
508 pixels), let us present its compressed version with the
use of the $k$-means method (Figure 12(a), $k = 5$, $n = 0$), the
Karhunen-Loeve Transform (Figure 12(b), $k = 1$, $n = 1$)
and the $(\omega, k)$-means algorithm (Figures 12(c) and 12(d)). As
we can see, the algorithm allows us to reconstruct compressed
images with great accuracy while reducing the amount of
information that needs to be stored (in our example we keep, e.g.,
only 5 coordinates in 192-dimensional space).

Table I presents the error in image reconstruction for the Lena
picture. We ran the $(\omega, k)$-means algorithm 16 times, and in each
run the clustering was improved 50 times.




Fig. 11. Linear components of the data structure. Fig. 11(a) - decay curve
(original data). Fig. 11(b) - outcome of the $(\omega, k)$-means algorithm for $k = 2$,
$\omega = (0, 1)$: we extract two linear components in the data (black dots mark
cluster centers with the corresponding lines describing those clusters; the vertical
line separates sound and background noise, after 4.3 s).

TABLE I 

Error in image decompression for certain k and n. 



 k \ n        0        1        2        3        4        5
   1    40328.6  19499.3  16358.1  12452.1  10160.5   8149.7
   2    27502.8  17193.3  13031.9  10382.8   9082.7   7913.2
   3    23261.2  15437.2  11631.8   9612.0   8350.1   7358.3
   4    20990.4  14454.1  11004.1   9192.7   7922.4   7095.9
   5    20150.1  13740.0  10602.4   8867.2   7745.9   6814.5



V. Implementation 

A sample implementation of the $(\omega, k)$-means algorithm, written
in the Java programming language, is available at
http://www.ii.uj.edu.pl/~misztalk
Fig. 12. Compressed version of the Lena picture. Subimage comparison: Fig. 12(a)
- classical $k$-means ($k = 5$, $n = 0$); Fig. 12(b) - Karhunen-Loeve Transform
($k = 1$, $n = 1$); Fig. 12(c) and Fig. 12(d) - $(\omega, k)$-means algorithm
($k = 5$, $n = 1$ and $k = 5$, $n = 5$).